Skip to content

Fix/perf thread safe skip#161

Open
shrutu0929 wants to merge 3 commits intoRefactron-ai:mainfrom
shrutu0929:fix/perf-thread-safe-skip
Open

Fix/perf thread safe skip#161
shrutu0929 wants to merge 3 commits intoRefactron-ai:mainfrom
shrutu0929:fix/perf-thread-safe-skip

Conversation

@shrutu0929
Copy link
Copy Markdown

@shrutu0929 shrutu0929 commented Apr 9, 2026

solve #154
The latest refinement resolves a critical thread-safety vulnerability within the parallel analysis engine where concurrent worker threads were improperly mutating a shared semantic_skip_warnings list. Under the default threading configuration, these simultaneous append operations risked intermittent data loss or internal state corruption during high-throughput repository scans. We mitigated this by refactoring the ParallelProcessor and Refactron.analyze() pipeline to treat skip warnings as discrete return values rather than shared mutable state. Worker tasks now encapsulate their findings in a thread-local context and return them to the main thread in a secure 3-tuple structure, where they are consolidated only after all parallel operations have safely concluded. This architectural adjustment ensures full concurrency safety without sacrificing performance, guaranteeing consistent and reliable analysis reports regardless of the scale or parallelization depth of the project.

Summary by CodeRabbit

  • Refactor
    • Parallel file processing now collects and returns analysis skip warnings along with results and errors, improving visibility of skipped analyses.
  • Tests
    • Updated tests to expect and validate the new skip-warnings output from parallel processing, ensuring correct behavior across sequential, threaded, and process-based runs.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 9, 2026

📝 Walkthrough

Walkthrough

The pull request updates parallel processing to propagate analysis skip warnings: ParallelProcessor.process_files() and its internal implementations now accept and return a 3-tuple including AnalysisSkipWarning; Refactron.analyze is updated to unpack and aggregate these skip warnings. Tests updated to the new 3-tuple contract.

Changes

Cohort / File(s) Summary
Parallel Processor Skip Warning Integration
refactron/core/parallel.py
Method signatures updated for process_files, _process_sequential, _process_parallel_threads, _process_parallel_processes to accept process_func that returns (Optional[FileMetrics], Optional[FileAnalysisError], Optional[AnalysisSkipWarning]). Aggregation of skip warnings (skips) added across sequential, thread, and process implementations; return tuple now includes skips.
Refactron Analysis Wrapper Update
refactron/core/refactron.py
The in-wrapper process_file_wrapper now returns a 3-tuple (file_metrics, error, skip_warn). Call site updated to unpack (results, errors, skip_warnings) and extend result.semantic_skip_warnings with collected warnings. Removed previous in-wrapper accumulation of skip warnings.
Tests: Contract and Mocks Updated
tests/test_config_management.py, tests/test_performance_optimization.py
Tests adjusted to expect three return values from process_files. Mock process_func/success/error mocks updated to return a third element (None) and assertions extended to check skips (typically empty) where relevant. Mock futures updated accordingly.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A trio now hops where a pair once stood,

Metrics, errors, and skips — all for the good.
Through threads and processes they scamper and sing,
Warnings find home in the results that they bring.
🥕

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Fix/perf thread safe skip' is vague and lacks specificity about what is being fixed. While it mentions 'thread safe skip', it does not clearly convey the main change (refactoring parallel processing to return skip warnings as discrete values rather than mutating shared state). The use of 'Fix/perf' is ambiguous regarding whether this is primarily a fix or a performance optimization. Consider a more descriptive title such as 'Refactor parallel processing to return skip warnings as tuple values for thread safety' to clearly communicate the architectural change and its purpose.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@shrutu0929 shrutu0929 force-pushed the fix/perf-thread-safe-skip branch from b516d1f to 00ad3a2 Compare April 9, 2026 14:04
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
refactron/core/refactron.py (1)

272-275: ⚠️ Potential issue | 🔴 Critical

Remove the worker-side append of skip warnings.

Line 274 reintroduces the shared mutable state this PR is trying to eliminate, and in parallel mode each warning is reported twice: once there and again on Line 308 after aggregation.

Suggested fix
                 try:
                     file_metrics, skip_warn = self._analyze_file(file_path)
-                    if skip_warn is not None:
-                        result.semantic_skip_warnings.append(skip_warn)

                     # Update incremental tracker
                     if self.incremental_tracker.enabled:
                         self.incremental_tracker.update_file_state(file_path)

                     return file_metrics, None, skip_warn

Also applies to: 301-308

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@refactron/core/refactron.py` around lines 272 - 275, Remove the worker-side
append of skip warnings so shared mutable state isn't modified by workers: in
the code path that calls self._analyze_file (the block that currently does
result.semantic_skip_warnings.append(skip_warn)), stop appending skip_warn to
result.semantic_skip_warnings in the worker; instead return skip_warn to the
caller and let the aggregator (the existing aggregation logic around the later
block that handles semantic_skip_warnings between lines ~301-308) collect and
append warnings once. Ensure _analyze_file still returns skip_warn (or a
consistent sentinel) and remove any duplicate appends from worker threads so
warnings are only added during the centralized aggregation step.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@refactron/core/parallel.py`:
- Around line 58-75: Call sites that unpack process_files must be updated to
handle the new 3-tuple return (metrics, errors, skips). Find occurrences that do
"results, errors = pp.process_files(...)" and change them to "results, errors,
skips = pp.process_files(...)", then update any downstream uses/asserters that
referenced the old two variables to account for the new "skips" list (e.g.
include skip counts in assertions or ignore it with an underscore if not
needed). Ensure all references to the former two-value unpacking are updated to
use the process_files function name and the new third return value.
- Line 10: The file refactron.core.parallel has uncommitted formatter/linter
changes and an overlong signature/docblock (around the block spanning lines
58–75); run and commit black and isort to normalize imports and formatting, then
break/rewrap the long function signature or docstring text in the function or
class defined in that 58–75 region so no line exceeds 100 characters (ensure the
function/method/class name in that block retains its identifiers), then re-run
flake8 to verify the overlong-line error is resolved and commit the formatted
file.

---

Outside diff comments:
In `@refactron/core/refactron.py`:
- Around line 272-275: Remove the worker-side append of skip warnings so shared
mutable state isn't modified by workers: in the code path that calls
self._analyze_file (the block that currently does
result.semantic_skip_warnings.append(skip_warn)), stop appending skip_warn to
result.semantic_skip_warnings in the worker; instead return skip_warn to the
caller and let the aggregator (the existing aggregation logic around the later
block that handles semantic_skip_warnings between lines ~301-308) collect and
append warnings once. Ensure _analyze_file still returns skip_warn (or a
consistent sentinel) and remove any duplicate appends from worker threads so
warnings are only added during the centralized aggregation step.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ccd48783-7b45-400f-b4e2-9dad27537232

📥 Commits

Reviewing files that changed from the base of the PR and between a9659f5 and 9f20bb0.

📒 Files selected for processing (2)
  • refactron/core/parallel.py
  • refactron/core/refactron.py


from refactron.core.analysis_result import FileAnalysisError
from refactron.core.models import FileMetrics
from refactron.core.models import FileMetrics, AnalysisSkipWarning
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Commit the formatter/linter output for this file.

Pre-commit is still red here: black and isort rewrite the file, and flake8 reports an overlong line in this changed signature/doc block.

As per coding guidelines, **/*.py: Use line length of 100 characters, enforced by black, isort, and flake8.

Also applies to: 58-75

🧰 Tools
🪛 GitHub Actions: Pre-commit

[error] pre-commit black hook id failed; files were modified by this hook (reformatted refactron/core/parallel.py). Commit the formatting changes.


[error] pre-commit isort hook id failed; files were modified by this hook (Fixing /home/runner/work/Refactron_lib/Refactron_lib/refactron/core/parallel.py). Commit the import/style changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@refactron/core/parallel.py` at line 10, The file refactron.core.parallel has
uncommitted formatter/linter changes and an overlong signature/docblock (around
the block spanning lines 58–75); run and commit black and isort to normalize
imports and formatting, then break/rewrap the long function signature or
docstring text in the function or class defined in that 58–75 region so no line
exceeds 100 characters (ensure the function/method/class name in that block
retains its identifiers), then re-run flake8 to verify the overlong-line error
is resolved and commit the formatted file.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
refactron/core/refactron.py (1)

270-307: ⚠️ Potential issue | 🟡 Minor

Pre-commit blocker: re-run black for this file before merge.

CI already reports formatting changes required, so this PR cannot pass pre-commit as-is.

As per coding guidelines: "Use black formatter with target-version set to py38, py39, py310, py311".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@refactron/core/refactron.py` around lines 270 - 307, This file fails the
project's Black formatting check; re-run Black on the module containing the
functions shown (e.g., around _analyze_file and process_file_wrapper which is
passed into ParallelProcessor.process_files) and commit the reformatted result
using the project's Black configuration (ensure target-version includes py38,
py39, py310, py311) so CI/pre-commit passes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_config_management.py`:
- Line 915: The tuple-unpacked variables from the call to p_thr.process_files
(e.g., "results, errors, skips = p_thr.process_files(...)" and similar
unpackings at the other locations) include values that are never used; rename
the unused ones to underscore-prefixed names (for example "_results", "_errors",
or "_skips") so only the actually-read variables remain non-underscored; update
the specific unpackings that call p_thr.process_files (and any equivalent test
unpackings around the noted locations) to use the underscore-prefixed names for
unused values to silence the RUF059 lint warnings.

---

Outside diff comments:
In `@refactron/core/refactron.py`:
- Around line 270-307: This file fails the project's Black formatting check;
re-run Black on the module containing the functions shown (e.g., around
_analyze_file and process_file_wrapper which is passed into
ParallelProcessor.process_files) and commit the reformatted result using the
project's Black configuration (ensure target-version includes py38, py39, py310,
py311) so CI/pre-commit passes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a6be5c7e-daea-4ad6-adf5-128fa1e206d1

📥 Commits

Reviewing files that changed from the base of the PR and between 9f20bb0 and dea2f93.

📒 Files selected for processing (3)
  • refactron/core/refactron.py
  • tests/test_config_management.py
  • tests/test_performance_optimization.py


p_thr = ParallelProcessor(max_workers=2, use_processes=False, enabled=True)
results, errors = p_thr.process_files(files, lambda p: (None, None))
results, errors, skips = p_thr.process_files(files, lambda p: (None, None, None))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

RUF059: underscore unused unpacked return values in these tests.

Several tuple-unpacked variables are never read. Rename only the unused ones to underscore-prefixed names to clear Ruff warnings.

♻️ Suggested lint-safe updates
-    results, errors, skips = p_thr.process_files(files, lambda p: (None, None, None))
+    results, errors, _skips = p_thr.process_files(files, lambda p: (None, None, None))

-        results, errors, skips = pp.process_files([Path("a.py")], raises_func)
+        _results, errors, _skips = pp.process_files([Path("a.py")], raises_func)

-        results, errors, skips = pp.process_files(files, success_func)
+        results, _errors, skips = pp.process_files(files, success_func)

-        results, errors, skips = pp.process_files([Path("a.py"), Path("b.py")], raises_func)
+        _results, errors, _skips = pp.process_files([Path("a.py"), Path("b.py")], raises_func)

-        results, errors, skips = pp.process_files([Path("a.py")], success_func)
+        results, _errors, _skips = pp.process_files([Path("a.py")], success_func)

-            results, errors, skips = pp.process_files([Path("a.py")], success_func)
+            results, _errors, _skips = pp.process_files([Path("a.py")], success_func)

Also applies to: 1229-1229, 1247-1247, 1253-1253, 1268-1268, 1278-1278

🧰 Tools
🪛 Ruff (0.15.9)

[warning] 915-915: Unpacked variable skips is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_config_management.py` at line 915, The tuple-unpacked variables
from the call to p_thr.process_files (e.g., "results, errors, skips =
p_thr.process_files(...)" and similar unpackings at the other locations) include
values that are never used; rename the unused ones to underscore-prefixed names
(for example "_results", "_errors", or "_skips") so only the actually-read
variables remain non-underscored; update the specific unpackings that call
p_thr.process_files (and any equivalent test unpackings around the noted
locations) to use the underscore-prefixed names for unused values to silence the
RUF059 lint warnings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant